DNA SEQUENCE ENCODING BOVINE a-LACTALBUMIN
AND METHODS OF USE
FIELD OF THE INVENTION 
The present invention relates generally to a
DNA sequence encoding bovine a-lactalbumin and to methods
of producing proteins, including recombinant proteins, in
the milk of lactating genetically engineered or
transgenic mammals. The present invention also relates
to genetically engineered or transgenic mammals that
secrete the recombinant protein. The present invention
is also directed to a genetic marker for identifying
animals with superior milk-producing characteristics.

REFERENCE TO CITED ART 
Reference is made to the section preceding
the CLAIMS for a full bibliographic citation of the art
cited herein.

DESCRIPTION OF THE PRIOR ART 
a-Lactalbumin is a major whey protein found
in cow's milk (Eigel et al., 1984). The term "whey
protein" includes a group of milk proteins that remain
soluble in "milk serum" or whey after the precipitation
of casein, another milk protein, at pH 4.6 and 20°C.
a-Lactalbumin has these characteristics.

a-Lactalbumin is a secretory protein that
normally comprises about 2.5% of the total protein in
milk. a-Lactalbumin has been used as an index of mammary
gland function in response to hormonal regulation in
bovine explant culture (Akers et al., 1981; Goodman et
al., 1983) and as an index of udder development (McFadden
et al., 1986). a-Lactalbumin interacts with galactosyl
transferase and therefore plays an essential role in the
biosynthesis of the milk sugar lactose (Brew, K. and R.L.
Hill, 1975). Lactose is an important component in milk
and contributes to milk osmolality. It is the most
constant constituent in cow's milk (Larson, 1985).
a-Lactalbumin is useful as an index of lactogenesis in
cultured mammary tissue (McFadden et al., 1987). It is
therefore believed that a-lactalbumin is an important
protein in controlling milk yield and can be used as an
indicator of mammary function.

The expression of bovine a-lactalbumin may be
a potential rate-limiting process in dairy cattle. If
greater expression of the a-lactalbumin gene can be
obtained, then more milk and milk protein could be
produced. In other words, a-lactalbumin is a potential
Quantitative Trait Locus (QTL).

SUMMARY OF THE INVENTION 
One object of the present invention is to
detect possible genetic differences in the expression of
bovine a-lactalbumin.

Another object of the present invention is to
provide a DNA sequence encoding a mammary specific bovine
a-lactalbumin protein having a specified nucleotide
sequence.

It is also an object of the present invention
to provide a method for genetically engineering the
incorporation of one or more copies of a construct
comprising an a-lactalbumin control region, which
construct is specifically activated in the mammary
tissue.

These objects and others are addressed by the
present invention, which is directed to a DNA sequence
encoding bovine a-lactalbumin having a specified
nucleotide sequence.

The present invention is also directed to an
expression vector comprising this DNA sequence. Further,
the present invention is directed to the protein
a-lactalbumin encoded by the nucleotide sequence.

The present invention is also directed to an
expression system comprising a mammary specific
a-lactalbumin control region which, when genetically
incorporated into a mammal, permits the female of
that mammal to produce the desired recombinant protein in
its milk.

The present invention is also directed to a
genetically engineered or transgenic mammal comprising
the specified DNA sequence encoding bovine a-lactalbumin.

The present invention is also directed to a
DNA sequence coding for a-lactalbumin, which is
operatively linked to an expression system coding for a
mammary-specific a-lactalbumin protein control, or any
control region which specifically activates a-lactalbumin
in milk or in mammary tissue, through a DNA sequence
coding for a signal peptide that permits secretion and
maturation of the a-lactalbumin in the mammary tissue.

The present invention is also directed to a
process for genetically engineering the incorporation of
one or more copies of a construct comprising an
a-lactalbumin control region which specifically activates
a-lactalbumin in milk or in mammary tissue. The control
region is operatively linked to a DNA sequence coding for
a desired recombinant protein through a DNA sequence
coding for a signal peptide that permits the secretion
and maturation of a-lactalbumin in the mammary tissue.

The present invention is also directed to a
process for the production and secretion into a mammal's
milk of an exogenous recombinant protein. The steps
include producing milk in a genetically engineered or
transgenic mammal. The mammal is characterized by an
expression system comprising an a-lactalbumin control
region. The control region is operatively linked to an
exogenous DNA sequence coding for the recombinant protein
through a DNA sequence coding for a signal peptide
effective in secreting and maturing the
recombinant protein in mammary tissue. The milk is then
collected for use. Alternatively, the exogenous
recombinant protein is isolated from the milk.

The present invention is also directed to a
selection characteristic for identifying superior milk
and milk protein producing animals comprising a DNA
sequence encoding bovine a-lactalbumin and having a
specified nucleotide sequence.

The present invention is also directed to a
selection characteristic for identifying superior milk
and milk protein producing mammals. The mammals are
characterized by inherited genetic material in the DNA
structure of the mammal. The genetic material encodes at
least one desired dominant selectable marker for bovine
a-lactalbumin. One such marker is adenine, which is
located at the -13 position on the control region of the
DNA sequence for a-lactalbumin. The present invention is
also directed to a method of predicting superior milk and
milk protein production in animals comprising identifying
the selection characteristic discussed above.

The present invention is further directed to
a method for modifying the milk composition in mammals
which comprises inserting a DNA sequence encoding bovine
a-lactalbumin having a specified nucleotide sequence.

The DNA sequence and the various methods of
using it have potentially beneficial uses for dairy
farmers, artificial insemination organizations, genetic
marker companies, and embryo transfer and cloning
companies, to name a few.

The uses for this genetic marker include the
identification of superior nuclear transfer embryos and
the identification of superior embryos to clone.

The present invention also will aid in the
progeny testing of sires. The specified DNA sequence can
be used as a genetic marker to identify possible elite
sires in terms of milk production and milk protein
production. This will increase the reliability of buying
superior dairy cattle.

The present invention also will provide
assistance in farm management decisions, such as sire
selection and selective culling. The physiological
markers assist in determining future production
performance in addition to a cow's pedigree. From this
information, one could buy or retain a heifer with a DNA
sequence encoding a-lactalbumin of the present invention
and consider culling a heifer without the proper
sequence.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a schematic illustration of a
partial restriction map of the bovine a-lactalbumin of
the present invention. The sequence contains 2.0
kilobases of a 5' flanking region, 1.7 kilobases of a
coding region and 8.8 kilobases of a 3' flanking region.
Digestion with Hpa I yields a 2.8 kilobase fragment
containing the whole 5' flanking region.

Fig. 2 depicts in schematic outline a map of
the plasmid A-lac Pro/pIC 20R. A Hpa I fragment of the
genomic clone was inserted into the EcoRV site of pIC
20R. The Hpa I fragment contains 2.1 kb of 5' flanking
DNA, the signal peptide coding region of a-lactalbumin and
8 bases encoding the mature a-lactalbumin protein. Six
unique enzyme sites are available for attaching various
genes to the sequence.

Fig. 3 is a schematic illustration of a
detailed map of the a-lactalbumin 5' flanking control
region cloned in the EcoRV site of the plasmid pIC 20R (SEQ
ID NO:1, SEQ ID NO:2).

Fig. 4 is a schematic illustration of a
detailed map of the 8.0 kilobase Bgl II fragment.

Fig. 5 depicts the nucleotide sequence (SEQ
ID NO:3) of the control/enhancer region of the bovine
a-lactalbumin protein.

Fig. 6 depicts in schematic outline a map of
a plasmid containing the bovine a-lactalbumin-bovine B-casein
gene construct.

Fig. 7 illustrates a sequence comparison
between human and bovine genes in the 5' flanking region
of the a-lactalbumin gene. It compares the U.S. bovine
sequence of the present invention (SEQ ID NO:4), a human
sequence (SEQ ID NO:5) and the French bovine sequence (SEQ ID
NO:6) for the putative steroid response element, and
the U.S. bovine sequence of the present invention (SEQ
ID NO:7), a human sequence (SEQ ID NO:8) and the French
bovine sequence (SEQ ID NO:9) for the RNA polymerase binding
region, surrounding three of the four nucleotide sequence
variant mutations.

Fig. 8 is a DOTPLOT™ graph comparing the
bovine a-lactalbumin 5' flanking sequence to the same
region of the human a-lactalbumin sequence.

Fig. 9 is a DOTPLOT™ graph comparing the
bovine a-lactalbumin 5' flanking sequence to the same
region of the guinea pig a-lactalbumin sequence.

Fig. 10 is a DOTPLOT™ graph comparing the
bovine a-lactalbumin 5' flanking sequence to the same
region of the rat a-lactalbumin sequence.

Fig. 11 is a graph illustrating expression
levels observed in each of three a-lactalbumin transgenic
mouse lines.

Fig. 12 is a 4% NuSieve autoradiographic gel
of Mnl I digested PCR products.

Fig. 13 is a graph illustrating a scatter 
plot of each data point in Fig. 12 as well as mean values 
for each of the three genotypes. 

DETAILED DESCRIPTION OF THE PREFERRED INVENTION
In the Description the following terms are
employed:

Genetic engineering, manipulation or
modification: the formation of new combinations of
materials by the insertion of nucleic acid molecules
produced outside the cell into any virus, bacterial
plasmid or other vector system so as to allow their
incorporation into a host organism in which they do not
naturally occur, but in which they are capable of
continued propagation at least throughout the life of the
host organism. Although the term incorporates transgenic
alteration, the manipulation of the genomic sequence does
not have to be permanent, i.e., the genetic engineering
can affect only the animal which was directly
manipulated.

Transgenic animals: permanently genetically
engineered animals created by introducing new DNA
sequences into the germ line via addition to the egg.

It is within the scope of the present
application to use any mammal for the invention.
Examples of mammals include cows, sheep, goats, mice,
oxen, camels, water buffaloes, llamas and pigs.
Preferred mammals include those that produce large
volumes of milk and have long lactating periods.

The present invention is directed to a gene
which encodes bovine a-lactalbumin. This gene has been
isolated and characterized. The 5' flanking region of
the gene has been cloned into six vectors for use as a
mammary specific control region in the production of
genetically engineered mammals. To better understand the
regulation of this control region, 2.0 kilobases of the
5' flanking sequence have been sequenced. The
a-lactalbumin 5' flanking sequence serves as a useful
mammary-specific "control/enhancer complex" for
engineering genetic constructs that could be capable of
driving the expression of novel and useful proteins in
the milk of genetically engineered or transgenic mammals.
This can result in an increase in milk production and in the
protein content of milk, a change in the milk and/or
protein composition of milk, and the production of
valuable proteins in the milk of genetically engineered
or transgenic mammals. Such proteins include insulin,
growth hormone, growth hormone releasing factor,
somatostatin, tissue plasminogen activator, tumor
necrosis factor, lipocortin, coagulation factors VIII and
IX, the interferons, colony stimulating factor, the
interleukins, urokinase, and industrial enzymes such as
cellulases, hemicellulases, peroxidases, and thermally
stable enzymes.

The a-lactalbumin gene is the preferred gene
for use in the process because its 5' control region is
mammary specific. It also exerts the tightest
lactational control of all milk proteins. Further, it is
independently regulated from other milk proteins and is
produced in large quantity by lactating animals.
Total Sequence 

A gene encoding the milk protein bovine
a-lactalbumin was isolated from a bovine genomic library
(Woychik, 1982). The Charon 28 lambda library was probed
using a bovine a-lactalbumin cDNA (Hurley, 1987) and a
770 base pair a-lactalbumin polymerase chain reaction
product. The positive lambda clone includes 12.5
kilobases of inserted bovine sequence, consisting of 2.0
kilobases of a 5' flanking (control/enhancer) region, a
1.7 kilobase coding region and 8.8 kilobases of a 3'
flanking region. A partial restriction map of the clone
is illustrated in Fig. 1.

A 2.8 kilobase Hpa I fragment including the
2.0 kilobase control region along with the signal peptide
coding region was cloned into the EcoRV site of the
plasmid pIC 20R. The plasmid is illustrated in schematic
outline in Fig. 2.

An 8.0 kilobase Bgl II fragment containing a
2.0 kilobase 5' flanking control region, a 1.7 kilobase
coding region, 3.0 kilobases of a 3' flanking region and 1.2
kilobases of lambda DNA has also been isolated.
Reference is made to Fig. 4 for a map of the 8.0
kilobase fragment. Transgenic mice have been produced
using the Bgl II fragment.
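
The segment sizes quoted above can be checked by simple addition. The following Python sketch is illustrative only: the dictionary and function names are hypothetical, and the values are the approximate kilobase figures stated in the text.

```python
# Approximate segment sizes in kilobases, as stated in the text.
# The names used here are illustrative and not part of the disclosure.
lambda_clone_kb = {
    "5' flanking (control/enhancer) region": 2.0,
    "coding region": 1.7,
    "3' flanking region": 8.8,
}
bgl2_fragment_kb = {
    "5' flanking control region": 2.0,
    "coding region": 1.7,
    "3' flanking region": 3.0,
    "lambda DNA": 1.2,
}


def total_kb(segments):
    """Sum the segment sizes, rounded to one decimal place."""
    return round(sum(segments.values()), 1)


print(total_kb(lambda_clone_kb))   # 12.5, the stated size of the bovine insert
print(total_kb(bgl2_fragment_kb))  # 7.9, close to the stated 8.0 kilobases
```

The small gap between 7.9 and 8.0 kilobases for the Bgl II fragment is within what would be expected from sizes rounded to one decimal place.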

Control/Enhancer Region

The 2.0 kilobase 5' flanking region has been
cloned into the vectors pIC 20R and Bluescript KS+. A
schematic illustration of the a-lactalbumin 5' flanking
control region cloned in the EcoRV site of pIC 20R is
depicted in Figs. 2 and 3 (SEQ ID NO:1, SEQ ID NO:2).

The construct's multiple cloning site, which
exists downstream of the signal peptide coding region,
permits various genes to be attached to the a-lactalbumin
control region. Thus, this vector allows for easy
attachment of specific coding sequences of genes. It
contains all elements necessary for expression of
proteins in milk, i.e., a mammary specific control
region, a mammary specific signal peptide coding region
and a mature protein-signal peptide splice site which is
able to be cleaved in the mammary gland. The vector also
contains many unique restriction enzyme sites for ease of
cloning. Attachment of genes to this control region will
allow for mammary expression of the genes when these
constructs are placed into mammals. These vectors also
contain the a-lactalbumin signal peptide coding sequence,
which will allow for proper transport of the expressed
protein into the milk of the lactating mammal.

The control region construct has driven
mammary expression of a desired protein in transgenic
mice. Bovine a-lactalbumin levels of greater than 1
mg/ml have been observed in the milk of transgenic mouse
lines (as described in Example 2, infra). Constructs
containing the 2.0 kilobase region attached to the bovine
B-casein gene (Bonsing, J., et al., 1988) as well as the
bacterial reporter gene chloramphenicol acetyl
transferase have been produced in our lab. Fig. 6 is a
schematic representation of a plasmid containing the
bovine a-lactalbumin-bovine B-casein gene construct. The
genomic DNA sequence containing the bovine B-casein gene
was attached to the bovine a-lactalbumin 5' flanking
sequence. The vector contains
the polyadenylation site of B-casein along with
approximately 100 base pairs of 5' flanking DNA. The 100
base pairs of 5' flanking DNA are attached to the bovine
a-lactalbumin 5' flanking region at the -100 position.
The construct uses the proximal promoter elements of
B-casein and the distal control region elements of
a-lactalbumin. The B-casein construct has been used to
produce transgenic mice as is illustrated in the
examples.

To understand the control of the
control/enhancer region, the 2.0 kilobases of 5' flanking
region were sequenced. A single strand copy of the
sequence is listed in Fig. 5 (SEQ ID NO: 3). The sequence
is listed 5' to 3' with the signal peptide coding region
underlined.
Regulatory Sequences

Potential regulatory sequences contained
within the 5' flanking region of bovine a-lactalbumin
have been identified. There are possible regulatory
regions in the introns as well as in the 3' flanking
region. Portions of the suspected control regions were
examined for possible sequence differences in the
population which might be related to milk and milk
protein production of individual cows. The differences
in the regulatory regions of a-lactalbumin are expected
to lead to differences in expression of a-lactalbumin
mRNA. The increased cellular content of mRNA will
increase the expression of a-lactalbumin protein with a
concomitant increase in lactose synthase resulting,
ultimately, in a milk and milk protein production
increase. This type of mechanism would be considered a
major gene effect on milk and milk protein production by
a-lactalbumin. The changes are viewed as causally linked
to changes in milk and milk protein production and not
correlatively linked. Correlatively linked traits are
those which are closely associated with an unknown
genetic locus which has the direct impact on the
quantitative trait.

Sequence differences between the U.S.
Holstein and the French cow (Vilotte et al., 1987) of an
unknown breed were found at four positions within the 5'
flanking region. One of the identified sequences has a
sequence which would indicate that it was a steroid






WO 93/04165 



PCT/US92/06549






hormone response element. Two other differences were 
noted in the RNA polymerase binding region and a fourth 
in the signal peptide coding region of the gene. Because 
of the relationship between these sequences and known 
control sequences of mammalian genes, all the variations 
occur in regions one would expect to be involved in 
regulation of the amount of mRNA produced. Further, 
genetic variations which occur in factors binding to 
these regions would also be expected to cause changes. 

Fig. 7 illustrates sequence variants observed
in the 5' flanking region among the U.S. bovine sequence
of the present invention, the human sequence (Hall et al., 1987)
and the French bovine sequence (Vilotte et al., 1987) for the
putative steroid response element (SEQ ID NO:4, SEQ ID NO:5
and SEQ ID NO:6 respectively) and for the RNA polymerase
binding region (SEQ ID NO:7, SEQ ID NO:8 and SEQ ID NO:9
respectively). All of the differences occur in highly
conserved portions of the gene, as seen by comparing this
region to the same region of the human a-lactalbumin gene.
Fig. 7 also shows that the positions where the bovine genes
differ are the same positions at which the human gene differs
from the bovine. These data indicate that the bases are part of a
potentially important control region.

A method has been devised to give a clear-cut
differentiation between two of the variants at a position
-13 bases from the start of transcription, i.e., 13 base
positions from the signal peptide coding region. The two
variants are termed a-lac (-13) A and a-lac (-13) B.
The a-lac (-13) A genotype is an adenine base at position
-13; the a-lac (-13) B genotype is either a guanine,
thymine or cytosine base at -13. They can be
differentiated with a simple restriction enzyme digest of
an amplified polymerase chain reaction (PCR) product
using a specific restriction enzyme (Mnl I). Because of
the specificity of the restriction enzyme Mnl I, the
restriction analysis is unable to distinguish among the
three possible bases of the B genotype. The a-lac (-13) A allele
contains an extra Mnl I site at position -13, giving the
smaller band observed on the gel.
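
The logic of the digest can be sketched in a few lines of Python. This is a minimal illustration, not the laboratory protocol: the flanking sequence in `TEMPLATE` is a hypothetical toy sequence rather than the bovine sequence, the function names are invented for the example, the motif is the one quoted in this description for Mnl I, and the enzyme's actual cut position relative to its site is not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def site_positions(seq, motif):
    """0-based top-strand positions where the motif occurs on either strand."""
    seq = seq.upper()
    hits = set()
    for pattern in {motif.upper(), reverse_complement(motif)}:
        start = seq.find(pattern)
        while start != -1:
            hits.add(start)
            start = seq.find(pattern, start + 1)
    return sorted(hits)


MOTIF = "CTCC"  # recognition motif as quoted in this description for Mnl I

# Hypothetical toy context; "{}" marks the variable base (assumption, not real data).
TEMPLATE = "TTTAAAGG{}GAAATTT"


def call_genotype(base):
    """Return 'A' if the variable base creates a site in the toy context, else 'B'."""
    return "A" if site_positions(TEMPLATE.format(base), MOTIF) else "B"


for base in "ACGT":
    print(base, call_genotype(base))
```

In this toy context only adenine completes the motif (on the complementary strand), so guanine, thymine and cytosine all give the same uncut pattern, which is why the digest alone cannot tell them apart.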

To amplify the appropriate region of DNA,
oligonucleotides which frame the sequence of interest
were synthesized. These oligonucleotides were chosen
because of their specific chemical characteristics.
These oligonucleotides were then used in a polymerase
chain reaction to amplify the framed portion of the
a-lactalbumin gene. The oligonucleotides have the
following sequences:

a-lac Seq. 1 (SEQ ID NO: 10)
5' ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT 3'
a-lac Seq. 2 (SEQ ID NO: 11)
5' AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT 3'
Restriction fragment analysis (Sambrook, J.
et al., 1989) was used to examine animals from a number
of breeds of cattle. In most breeds, namely, Jersey,
Guernsey, Brown Swiss, Simmental and Brahman, only one of
two genotypes is found. This is the a-lac (-13) B
genotype. However, in the most popular and highest milk
producing breed of cattle, the Holstein, two genotypes
occur at this position. The frequency of the A genotype
was 27% in random samples, while the frequency of the B
genotype was 73%. Holsteins contain both the genotype
found in the other breeds as well as a separate distinct
genotype which appears to have arisen within the last
thirty years in the U.S. Holstein population as
determined by examining pedigrees of sires currently in
use. It appears that this genotype has unknowingly been
selected for using traditional animal selection.
Homozygous and heterozygous animals are found within the
Holstein population.
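
As a rough worked example, the proportions of homozygous and heterozygous animals that such frequencies would imply can be estimated as follows. This sketch rests on two assumptions that the description does not state: that the 27% and 73% figures are allele frequencies, and that the population is in Hardy-Weinberg equilibrium (random mating, no selection). Since selection for the A variant is suggested above, the real proportions may differ. The function name is hypothetical.

```python
def hardy_weinberg(p_a):
    """Expected genotype frequencies for a two-allele locus under Hardy-Weinberg."""
    q_b = 1.0 - p_a
    return {"AA": p_a ** 2, "AB": 2 * p_a * q_b, "BB": q_b ** 2}


# Assumption: 0.27 is treated as the frequency of the A allele.
expected = hardy_weinberg(0.27)
for genotype, freq in expected.items():
    print(f"{genotype}: {freq:.1%}")
```

Under these assumptions roughly 7% of animals would be AA, 39% AB and 53% BB.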

The genotype (a-lac (-13)) has been examined
for its correlation with milk and milk protein
production. The three additional variations are being
examined to determine the frequency of their differences
in the cattle population and their correlation with milk




and milk protein production. The possible linkage of
these genotypes is also being examined using DNA
sequencing. The goal of this technology is to identify
the optimal regulatory genotype for a-lactalbumin and to
select animals with those particular characteristics.

Detection and Selection of Four Genetic Variants 

The region of sequence where the a-lac (-13)
variation occurs can be amplified using the polymerase
chain reaction (PCR) (Sambrook et al., 1989) and two of
the following primers, which were developed for this
purpose. Each primer
allows for amplification of a specific portion of the
a-lactalbumin gene. Combinations of the listed primers
can be used to amplify the region between any two of the
primer locations listed below.

Primer No.  SEQ ID NO:  Primer sequence                   Primer location
                                                          (from translation
                                                          start site)

1           12          5' CTCTTCCTGGATGTAAGGCTT 3'       (-120) - (-100)
2           13          5' TCCTGGGTGGTCATTGAAAGGACT 3'    (-2000) - (-1975)
3           14          5' CAATGTGGTATCTGGCTATTTAGTG 3'   (-717) - (-692)
4           15          5' AGCCTGGGTGGCATGGAATA 3'        (+53) - (+33)
5           16          5' GAAACGCGGTACAGACCCCT 3'        (+453) - (+433)
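
The approximate size of the product expected from any forward and reverse primer pair in the table can be estimated from the listed coordinates. The Python sketch below is illustrative and makes two assumptions: that the first coordinate listed for each primer is its 5' end, and that the product size is the simple difference between the two outer coordinates (off-by-one conventions and the numbering around the start site are ignored). The names are hypothetical.

```python
# Primer coordinates from the table, relative to the translation start site.
# Primers whose coordinates ascend are treated as forward, descending as reverse.
PRIMERS = {
    1: (-120, -100),
    2: (-2000, -1975),
    3: (-717, -692),
    4: (53, 33),
    5: (453, 433),
}


def is_forward(primer_no):
    """True if the primer's listed coordinates run in ascending order."""
    start, end = PRIMERS[primer_no]
    return start < end


def product_span(forward_no, reverse_no):
    """Approximate product size as the distance between the outer primer ends."""
    if not is_forward(forward_no) or is_forward(reverse_no):
        raise ValueError("need one forward and one reverse primer")
    return PRIMERS[reverse_no][0] - PRIMERS[forward_no][0]


print(product_span(3, 4))  # 770
print(product_span(1, 4))  # 173
```

Under these assumptions primers 3 and 4 span 770 bases, which agrees with the 770 base pair PCR product referred to in this description, although the description does not state which primer pair produced it.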

After amplification of the specific region,
the DNA is either sequenced or digested with restriction
enzymes to detect the sequence differences. In the case
of the a-lac (-13) variation, the sequence difference can
be seen using the restriction enzyme Mnl I (5' CTCC 3'
recognition site). The PCR DNA product is digested with
Mnl I and then run on a 4% NuSieve agarose gel to observe
the polymorphism.

A 650 base pair sequence containing all four
of the variations is being examined using a unique
sequencing technique. PCR is initially used to amplify a
770 base pair portion of the a-lactalbumin 5' flanking
region. Another PCR reaction is then performed using a
portion of the initial reaction and the following primers
(SEQ ID NO: 10 and SEQ ID NO: 11 respectively):

a-lac Seq. 1 5' ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT 3'
a-lac Seq. 2 5' AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT 3'

The primers listed above contain a portion of
the a-lactalbumin gene as well as both M13 DNA sequencing
primers. The primers are designed to allow for DNA
sequencing to be performed in both directions on the PCR
DNA product. The final PCR product will contain the
region of a-lactalbumin containing the four genetic
variants, the two M13 sequencing priming regions and 5
"dummy bases" on the end to aid in the M13 primer
binding.
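
The three-part layout of these tailed primers can be made visible by splitting each one around its M13 portion. In the Python sketch below, the two 18-base M13 sequences are the widely used universal forward (-21) and reverse sequencing primers; that these are the exact M13 sequences intended is an assumption, since the description does not spell them out. The function name is hypothetical.

```python
# Commonly used M13 universal sequencing primers (assumed, not stated in the text).
M13_FORWARD = "TGTAAAACGACGGCCAGT"
M13_REVERSE = "CAGGAAACAGCTATGACC"

PRIMER_1 = "ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT"  # SEQ ID NO: 10
PRIMER_2 = "AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT"  # SEQ ID NO: 11


def split_primer(primer, m13):
    """Split a tailed primer into (dummy bases, M13 portion, gene-specific portion)."""
    start = primer.find(m13)
    if start == -1:
        raise ValueError("M13 sequence not found in primer")
    end = start + len(m13)
    return primer[:start], primer[start:end], primer[end:]


print(split_primer(PRIMER_1, M13_FORWARD))
print(split_primer(PRIMER_2, M13_REVERSE))
```

With these assumed M13 sequences, each primer resolves into 5 leading bases, an 18-base M13 portion and a 20-base a-lactalbumin portion, which matches the 5 "dummy bases" mentioned above.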

Comparison of Highly Conserved Portions of the 5'
Flanking Region of a-Lactalbumin Between Species

Reference is made to Figs. 8-10 for
DOTPLOT™ graphs comparing the bovine a-lactalbumin
sequence to the same region of the human (Fig. 8), guinea
pig (Fig. 9), and rat (Fig. 10) sequences. The region in Fig. 8
(human) spans 819 base pairs. The sequences are highly
conserved to about 700 base pairs. The region in Fig. 9
(guinea pig) spans 1381 base pairs. The sequences are
highly conserved to about 700 base pairs, but then
diverge. The region in Fig. 10 (rat) spans 1337 base
pairs. The sequences are highly conserved to about 700
base pairs, but then diverge. Species differences in
control regions would be expected to occur in
non-conserved regions of the sequence.
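
The idea behind a dot plot comparison can be illustrated with a short Python sketch. This is a simplified exact word-match version, not the algorithm of the DOTPLOT™ program itself; the two sequences are hypothetical toy sequences, and the function names and the window size are chosen only for the example.

```python
def dot_matrix(seq_x, seq_y, window=4):
    """Return (i, j) pairs where the window-length words of the two sequences match."""
    seq_x, seq_y = seq_x.upper(), seq_y.upper()
    words = {}
    for j in range(len(seq_y) - window + 1):
        words.setdefault(seq_y[j:j + window], []).append(j)
    dots = []
    for i in range(len(seq_x) - window + 1):
        for j in words.get(seq_x[i:i + window], []):
            dots.append((i, j))
    return dots


def render(seq_x, seq_y, window=4):
    """Draw the dot matrix as text: one row per position in seq_y."""
    dots = set(dot_matrix(seq_x, seq_y, window))
    width = len(seq_x) - window + 1
    rows = []
    for j in range(len(seq_y) - window + 1):
        rows.append("".join("*" if (i, j) in dots else "." for i in range(width)))
    return "\n".join(rows)


# Hypothetical toy sequences: a shared stretch followed by divergent tails.
x = "ACGTTGCAAGGCTTTT"
y = "ACGTTGCAAGGAACCA"
print(render(x, y))
```

A conserved region appears as an unbroken diagonal of dots, and the diagonal stops where the two sequences diverge, which is the pattern described for the comparisons in Figs. 8-10.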

Comparison of 5' Flanking Region of Bovine a-Lactalbumin
to Other Bovine Milk Protein Genes

Portions of the 5' flanking region of the
other bovine milk protein genes (as1- and as2-casein,
B-casein, K-casein and B-lactoglobulin) which are highly
conserved with the a-lactalbumin 5' flanking region were
identified. It is probable that sequence differences
within these regions will also have an effect on mRNA
production as well as final protein production. Two
examples of these highly homologous regions are listed
below.

The bovine a-lactalbumin sequence from (-161)
- (-115) (SEQ ID NO: 17) is compared to the bovine B-casein
sequence (SEQ ID NO: 18) corresponding to the same region
of the gene. Percent similarity is 69% over 46 bases.

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCT
AGGAGGCT.ATTCTTTCCTTTTAGTCTATACTGTCTTCGCTCTTCA

The bovine a-lactalbumin sequence (SEQ ID
NO: 19) from (-1420) - (-1351) is compared to the bovine
B-casein sequence (SEQ ID NO: 20) corresponding to the
same region of the gene. Percent similarity is 75% over
69 bases.

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCA
TCTCAGAAATCACACTTTTTTGCCTGTG.............GCCTTGGCA

TACCAGAAGCTAACAGCTA
.ACCAAAAGCTAACACATA
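
A percent similarity figure of this kind can be reproduced by counting identical bases over the aligned, non-gap positions. The Python sketch below applies this to the second comparison. It assumes that the blank run in the printed B-casein line is a single 13-base gap placed where the blank appears and that gap positions are excluded from the count; neither assumption is stated in the description. The function and variable names are hypothetical.

```python
# a-Lactalbumin line of the second comparison (69 bases).
ALAC = (
    "TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCA"
    + "TACCAGAAGCTAACAGCTA"
)
# B-casein line; "." marks a gap. The 13-base gap is an assumed reconstruction.
BCAS = (
    "TCTCAGAAATCACACTTTTTTGCCTGTG" + "." * 13 + "GCCTTGGCA"
    + ".ACCAAAAGCTAACACATA"
)


def percent_identity(top, bottom, gap="."):
    """Return (matches, compared positions, percent) ignoring gap columns."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    matches = compared = 0
    for a, b in zip(top, bottom):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches, compared, 100.0 * matches / compared


matches, compared, pct = percent_identity(ALAC, BCAS)
print(matches, compared, f"{pct:.1f}%")
```

Under these assumptions 41 of the 55 compared positions are identical, about 74.5%, which is in line with the 75% figure quoted for this region.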



The included data indicate that the bovine
a-lactalbumin gene will be useful as a selection tool in
the dairy cattle industry as well as a valuable
control/enhancer and gene to be used in the field of
genetically engineered mammals. The control region we
have cloned contains the necessary regulatory elements to
express genes in the milk of genetically engineered
mammals as well as the "high expressing genotype" as
shown by our milk and milk protein production and
sequence variation data. These facts make this a useful
gene in both industrial and research areas. Application
of these techniques to the other milk proteins will allow
for the selection of valuable genotypes corresponding to
the B-casein, as1- and as2-casein and K-casein genes and
the B-lactoglobulin genes.
Coding Region 

The coding region of the a-lactalbumin 
protein includes a 1.7 kilobase sequence. 

3' Flanking Region

The 3' flanking region is an 8.8 kilobase
flanking region downstream of the DNA sequence coding for
the desired recombinant protein. This region apparently
stabilizes the RNA transcript of the expression system
and thus increases the yield of desired protein from the
expression system.
Operation 

The above-described expression systems may be
prepared by methods well-known in the art. Examples
include various ligation techniques employing
conventional linkers, restriction sites, etc.
Preferably, these expression systems are part of larger
plasmids.

After isolation and purification, the
expression systems or constructs are added to the gene
pool which is to be genetically altered.

The methods for genetically engineering
mammals are well-known to the art. Reference is made
to Alberts, B. et al., 1989 and Lewin, B., 1990, for
textbook descriptions of genetic engineering and
transgenic alteration of animals. Briefly, genetic
engineering involves the construction of expression
vectors so that a cDNA clone or genomic structure is
connected directly to a DNA sequence that acts as a
strong promoter for DNA transcription. By means of
genetic engineering, mammalian cells, such as mammary
tissue, can be induced to make vast quantities of useful
proteins.

For the purposes of this invention, the term
"genetic engineering," as defined supra in the list of
definitions, includes single line alteration, i.e.,
genetic alteration only during the life of the affected
animal with no germ line permanence. The construct can
be genetically incorporated in mammalian glands such as
mammary glands and mammalian stem cells.

Genetic engineering also includes transgenic
alteration, i.e., the permanent insertion of the gene
sequence into the genomic structure of the affected
animal and any offspring. Transgenically altering a
mammal involves microinjecting a DNA construct into the
pronuclei of the fertilized mammalian egg to cause one or
more copies of the construct to be retained in the cells
of the developing mammal. In a transgenic animal, the
engineered genes are permanently inserted into the germ
line of the animal.

The genetically engineered mammal is then
characterized by an expression system comprising the
a-lactalbumin control region operatively linked to an
exogenous DNA sequence coding for the recombinant protein
through a DNA sequence coding for a signal peptide
effective in secreting and maturing the recombinant
protein in mammary tissue. In order to produce and
secrete the recombinant protein into the mammal's milk,
the transgenic mammal must be allowed to produce the
milk, after which the milk is collected. The milk may
then be used in standard manufacturing processes. The
exogenous recombinant protein may also be isolated from
the milk according to methods known to the art.
Selection Characteristics

The a-lactalbumin control/enhancer sequence
of Fig. 1 is also important as a selection characteristic
for identifying superior or elite milk producing mammals.
Presently, those in the dairy cattle business can only
rely on pedigree information, which is frequently not
available, to predict milk and milk protein production in
mammals, specifically the bovine species. The study of
physiological markers as a means for determining milk and
milk protein production has received some interest. The
most common physiological marker traits studied in dairy
cattle are hormones, enzymes, and different blood
metabolites. Components of the immune system have also
been studied. Traits listed as possible marker traits
for milk yield include thyroxine, blood urea nitrogen,
growth hormones, insulin-like growth factors and insulin,
and glucose and free fatty acids. While these techniques
have shown some advances in predicting milk and milk
protein production in a dairy animal, there is currently
no other reliable means to predict these characteristics.

The present invention provides a selection
characteristic for identifying superior milk and milk
protein-producing mammals comprising inherited genetic
material which is DNA occurring in the genetic structure
of the mammal in which the genetic material encodes a
dominant selectable marker for bovine a-lactalbumin.

The DNA sequence disclosed herein serves as a
characteristic marker for elite milk producing mammals.

The examples below describe the invention 
disclosed herein, although the invention is not to be 
understood as limited in any way to the terms and scope 
of the examples. 
EXAMPLES

Example 1: a-lac (-13) variation study. 

Forty-two mammals were selected in a
stratified random manner to provide mammals of a wide
range of milk and milk protein production capabilities
within the UW herd.

DNA was isolated according to procedures
known to the art from a random sample of 42 Holstein
dairy cows in the University of Wisconsin-Madison herd.
Each mammal was genotyped as described previously for the
a-lactalbumin (-13) variation using a 4% NuSieve gel of
MnlI-digested PCR products.
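
The restriction-based genotyping described above can be sketched in software. The following is a minimal illustration only, not the procedure used in the study. It assumes that MnlI recognizes CCTC (GAGG on the complementary strand), takes the 18-base region of SEQ ID NO: 7 as the allele carrying adenine, and uses a hypothetical second allele that differs by a single A-to-G change. That second sequence is an assumption made for the example and is not taken from this document.

```python
# Assumption: MnlI recognizes CCTC (GAGG on the complementary strand).
MNLI_SITE = "CCTC"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_sites(seq, site=MNLI_SITE):
    """Count occurrences of a recognition site on either strand."""
    seq = seq.upper()
    total = 0
    for motif in {site, reverse_complement(site)}:
        start = seq.find(motif)
        while start != -1:
            total += 1
            start = seq.find(motif, start + 1)
    return total


# SEQ ID NO: 7 as listed in this document (adenine at the variable position).
ALLELE_WITH_ADENINE = "TCTTGAGGGGTAACCAAA"
# Hypothetical alternative allele (A -> G); illustrative only.
ALLELE_HYPOTHETICAL = "TCTTGGGGGGTAACCAAA"

for name, seq in [("adenine allele", ALLELE_WITH_ADENINE),
                  ("hypothetical allele", ALLELE_HYPOTHETICAL)]:
    print(name, "MnlI sites:", count_sites(seq))
```

Under these assumptions the two alleles differ in the number of recognition sites, which is the kind of difference that would yield distinct fragment patterns on a gel.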

The gene frequency in this population is 28%
for the a-lac (-13) A and 72% for the a-lac (-13) B.
Each of the distinct genotypes is shown on the gel in
Fig. 12. The legend for the gel of Figure 12 is as
follows:

Lane 1      Molecular Weight Standards
Lanes 2-3   heterozygous a-lac (-13) AB
Lane 4      homozygous a-lac (-13) BB
Lane 5      heterozygous a-lac (-13) AB
Lane 6      homozygous a-lac (-13) BB
Lane 7      homozygous a-lac (-13) AA
Lane 8      heterozygous a-lac (-13) AB
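
Gene (allele) frequencies such as those reported above follow directly from genotype counts. The sketch below shows the arithmetic. The counts used are hypothetical, chosen only to fall near the reported frequencies for 42 animals, and are not the study's data. The function names are likewise illustrative.

```python
def allele_frequencies(n_aa, n_ab, n_bb):
    """Return (freq_A, freq_B) from counts of AA, AB and BB animals."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return freq_a, 1.0 - freq_a


def hardy_weinberg_expected(freq_a, n_animals):
    """Expected AA, AB, BB counts under Hardy-Weinberg proportions."""
    freq_b = 1.0 - freq_a
    return (freq_a ** 2 * n_animals,
            2 * freq_a * freq_b * n_animals,
            freq_b ** 2 * n_animals)


# Hypothetical genotype counts for 42 animals (illustrative only).
counts = {"AA": 3, "AB": 18, "BB": 21}
freq_a, freq_b = allele_frequencies(counts["AA"], counts["AB"], counts["BB"])
print("A: %.1f%%  B: %.1f%%" % (100 * freq_a, 100 * freq_b))
print("Expected counts:", hardy_weinberg_expected(freq_a, sum(counts.values())))
```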

Analysis of the genetic capabilities of the
42 mammals indicates a possible major gene effect caused
by the a-lac (-13) allele or linked to the a-lac (-13)
allele. A scatter plot of each data point as well as
mean values for each of the three genotypes is
illustrated in Fig. 13. Holstein cows were compared
using their predicted transmitting ability for milk.

The data indicate that the a-lac (-13) A
genotype is the preferred genotype for milk and milk
protein production. Table 1 shown below indicates the
statistical association of differences in milk and milk
protein production ability observed between each of the
genotypes for the traits listed below. Analysis of
variance and t-tests (LSD) were performed on the data.
All of the production yield traits were positively
correlated with the a-lac (-13) A allele. Milk protein
percentage was negatively correlated to the a-lac (-13) A
allele.
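
For readers unfamiliar with the analysis named above, the following sketch computes a one-way analysis of variance F statistic and a least significant difference (LSD) style t statistic for three genotype groups. It is a plain-Python illustration under stated assumptions. The trait values are invented for the example and are not the study's data. Converting the statistics to p-values would require an F or t distribution table or a statistics library.

```python
from math import sqrt


def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a list of groups."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_within = ss_within / df_within
    f_stat = (ss_between / df_between) / ms_within
    return f_stat, df_between, df_within, ms_within


def lsd_t(group_x, group_y, ms_within):
    """t statistic for two group means using the pooled ANOVA error term."""
    mean_x = sum(group_x) / len(group_x)
    mean_y = sum(group_y) / len(group_y)
    std_err = sqrt(ms_within * (1 / len(group_x) + 1 / len(group_y)))
    return (mean_x - mean_y) / std_err


# Hypothetical trait values by genotype (illustrative only).
aa = [900.0, 1100.0, 1000.0]
ab = [700.0, 850.0, 800.0, 750.0]
bb = [300.0, 450.0, 500.0, 350.0, 400.0]

f_stat, df_b, df_w, ms_w = one_way_anova([aa, ab, bb])
print("F(%d, %d) = %.2f" % (df_b, df_w, f_stat))
print("t (AA vs BB) = %.2f" % lsd_t(aa, bb, ms_w))
```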

Table 1

Trait / Genotype          Genotype
                          a-lac (-13) AA   a-lac (-13) AB   a-lac (-13) BB

PTA (Milk) / AA                 -               N.S.            p<0.02
PTA (Milk) / AB                N.S.              -              p<0.02
ME305 (Milk) / AA               -               N.S.            N.S.
ME305 (Milk) / AB              N.S.              -              p<0.1
PTA (Protein #) / AA            -               N.S.            N.S.
PTA (Protein #) / AB           N.S.              -              p<0.1
PTA (Protein %) / AA            -               N.S.            p<0.01
PTA (Protein %) / AB           N.S.              -              p<0.01

Example 2. Production of transgenic mice to study the
regulation of bovine a-lactalbumin gene expression.
Genomic Library Screening:

The gene encoding the milk protein bovine
a-lactalbumin was isolated from a bovine genomic library
(Woychik, 1982). The genomic library was screened
according to the following procedure. Approximately 1.5
million lambda plaques were transferred to nylon
membranes using procedures described by Maniatis et al.
(1989). The a-lactalbumin cDNA (Hurley, 1987) or a 770
base pair PCR product was nick translated (BEL) with
a-P32 labeled dCTP. Blots were prehybridized overnight
(65°C) then hybridized for 16 hours at 65°C. Blots were
washed (twice in 2X SSC, 1% SDS; once in 0.1X SSC, 0.1%
SDS) at 65°C and placed on Kodak X-OMAT film for
autoradiography. An 8.0 kilobase fragment containing the
a-lactalbumin gene was purified as illustrated in Fig. 4.
The 8.0 kilobase fragment contained 2.1 kilobases of 5'
flanking region, the 1.7 kilobase coding region and 2.6
kilobases of 3' flanking region.

Production of transgenic mice:

Mature C57B6 X DBA2J F1 (B6D2) females were
superovulated (PMSG and hCG) and mated with ICR or B6D2
males to yield fertilized eggs for pronuclear
microinjection. The eggs were microinjected using a
Leitz micromanipulator and a Nikon inverted microscope.
Forty normal appearing two cell embryos were transferred
to each pseudopregnant recipient (University of
Wisconsin-Madison Biotechnology Center Transgenic Mouse
Facility, Dr. Jan Heideman).

Screening of transgenic mice using PCR:

Tail DNA was extracted using the method
described by Constantini et al. (1986). Polymerase chain
reaction (PCR) was performed using 10 µl 10x PCR reaction
buffer (Promega Corp., Madison, WI), 200 µM each dNTP
(Pharmacia Intl., Milwaukee, WI), 1.0 µM each primer
(upstream primer 25mer -712 to -687 (5'
CAATGTGGTATCTGGCTATTTAGTG 3') (SEQ ID NO: 14), downstream
primer 20mer +39 to +59 (5' AGCCTGGGTGGCATGGAATA 3') (SEQ
ID NO: 15)), 1 unit Taq DNA polymerase (Promega Corp.,
Madison, WI) and 1 µg genomic DNA. Volume was adjusted
to 100 µl with double distilled sterile water and the
reaction was overlaid with heavy mineral oil. Samples
were subjected to 30 cycles (94°C 2 min., 50°C 1.5 min.,
72°C 1.5 min.). Products were run on a 1% agarose gel
and stained with ethidium bromide.
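
As an illustration of how a primer pair defines an amplified product, the sketch below locates the two primers given above (SEQ ID NO: 14 and SEQ ID NO: 15) on a template strand and reports the length of the region they span. The template here is a toy construct with hypothetical filler sequence, built only so that the example runs. It is not the bovine a-lactalbumin sequence, and the resulting length says nothing about the actual product size.

```python
UPSTREAM_PRIMER = "CAATGTGGTATCTGGCTATTTAGTG"   # SEQ ID NO: 14
DOWNSTREAM_PRIMER = "AGCCTGGGTGGCATGGAATA"      # SEQ ID NO: 15

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def amplicon_length(template, forward, reverse):
    """Length of the product spanned by the primers, or None if not found.

    The forward primer is matched directly on the given strand; the reverse
    primer is matched through its reverse complement on the same strand.
    """
    start = template.find(forward)
    end = template.find(reverse_complement(reverse))
    if start == -1 or end == -1 or end < start:
        return None
    return end + len(reverse) - start


# Toy template: hypothetical filler between the two primer-binding sites.
filler = "ACGT" * 10
toy_template = ("GG" + UPSTREAM_PRIMER + filler
                + reverse_complement(DOWNSTREAM_PRIMER) + "CC")

print("Amplicon length on toy template:",
      amplicon_length(toy_template, UPSTREAM_PRIMER, DOWNSTREAM_PRIMER))
```

The same function could be applied to any template string, for example a cleaned copy of a sequence from the Sequence Listing, to check that both primers are present and correctly oriented.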
Mouse Milking:

The mice were separated from their litters
for four hours and then anesthetized (0.01 ml/g body
weight I.P. injection of 36% propylene glycol, 10.5%
ethyl alcohol (95%), 41.5% sterile water, and 12% sodium
pentobarbital (50 mg/ml)). After being anesthetized the
mice were injected I.M. with 0.3 I.U. oxytocin and milked
using a small vacuum milking machine. Three of fifty-one
live offspring were identified as being transgenic using
polymerase chain reaction. Reference is made to Fig. 14
for a graph illustrating expression levels observed in
each of the 3 a-lactalbumin transgenic mouse lines.
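
The anesthetic dosing stated above reduces to simple arithmetic, sketched below. The calculation assumes that the listed percentages are by volume, so that the 50 mg/ml stock makes up 12% of the mixture. That reading is an assumption, and the function names are illustrative. The body weight in the example is hypothetical.

```python
def injection_volume_ml(body_weight_g, dose_ml_per_g=0.01):
    """Volume of anesthetic mixture for one animal."""
    return body_weight_g * dose_ml_per_g


def drug_dose_mg_per_kg(stock_mg_per_ml=50.0, stock_fraction=0.12,
                        dose_ml_per_g=0.01):
    """Drug dose per kg body weight implied by the mixture and dose rate."""
    mixture_mg_per_ml = stock_mg_per_ml * stock_fraction
    mg_per_g = mixture_mg_per_ml * dose_ml_per_g
    return mg_per_g * 1000.0


# Hypothetical 30 g mouse.
print("Injection volume (ml):", injection_volume_ml(30.0))
print("Implied dose (mg/kg):", drug_dose_mg_per_kg())
```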

ELISA:

Second generation mammals from one line were
milked and analysis was performed using an ELISA (enzyme
linked immunosorbent assay) for bovine a-lactalbumin
according to the following procedure:

1. Coat with 1/40k diluted bovine a-lactalbumin
antiserum, 100 µl per well (in 0.05M carbonate buffer, pH
9.6), on Nunc-Immuno Plate IF MaxiSorp.

2. Wash 4x with wash buffer (0.025% Tween
20 in PBS pH 7.2).

3. Add 50 µl assay buffer (0.04M MOPS,
0.12M NaCl, 0.01M EDTA, 0.1% gelatin, 0.05% Tween 20,
0.005% chlorhexidine digluconate, Leupeptin 50 mg/ml, pH
7.4).

4. Add 50 µl of standards and samples (in
assay buffer) in triplicate.

5. Add 50 µl 1/100k diluted a-lactalbumin
biotin conjugate.

6. Incubate overnight at 4°C.

7. Wash 4x with wash buffer.

8. Add 100 µl 1/10k assay buffer diluted
ExtrAvidin-peroxidase (Sigma). Incubate 2 hours at RT.

9. Wash 4x twice with wash buffer.

10. Add 125 µl fresh substrate buffer (200
µl tetramethylbenzidine (20 mg/ml in DMSO), 64 µl 0.5M
hydrogen peroxide, 19.74 ml sodium acetate, pH 4.8).

11. Incubate for 12 minutes at RT.

12. Add 50 µl 0.5M sulfuric acid to stop
substrate reaction.

13. Read absorbance at 450 nm minus 600 nm
in an EIA autoreader.
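
The final read-out steps of the procedure above (triplicate wells, absorbance at 450 nm minus 600 nm, comparison with standards) can be sketched as follows. This is an illustrative calculation only. The standard concentrations and absorbance values are hypothetical, and piecewise-linear interpolation is an assumed method rather than the curve fit actually used. Because samples are added together with a labeled a-lactalbumin conjugate, the example assumes that signal falls as concentration rises.

```python
from statistics import mean


def mean_corrected_absorbance(replicates):
    """Mean of (A450 - A600) over replicate wells.

    replicates: list of (a450, a600) tuples for one standard or sample.
    """
    return mean(a450 - a600 for a450, a600 in replicates)


def interpolate_concentration(signal, standards):
    """Piecewise-linear interpolation of concentration from a standard curve.

    standards: list of (concentration, mean corrected absorbance) pairs.
    Returns None if the signal lies outside the range of the standards.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo <= signal <= s_hi:
            if s_hi == s_lo:
                return c_lo
            fraction = (signal - s_lo) / (s_hi - s_lo)
            return c_lo + fraction * (c_hi - c_lo)
    return None


# Hypothetical standard curve: (concentration in ng/ml, corrected absorbance).
standard_curve = [(0.0, 1.0), (10.0, 0.8), (100.0, 0.4)]
# Hypothetical triplicate readings (A450, A600) for one sample.
sample_wells = [(0.66, 0.05), (0.64, 0.04), (0.65, 0.06)]

sample_signal = mean_corrected_absorbance(sample_wells)
print("Corrected absorbance:", round(sample_signal, 3))
print("Interpolated concentration:",
      interpolate_concentration(sample_signal, standard_curve))
```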

Bovine a-lactalbumin was present at
concentrations up to and beyond 1.0 mg/ml mouse
milk. Expression was determined by Western blotting in
the following steps. The 14% PAGE gel was transferred to an
Immobilon-P membrane (Millipore), which was blocked in
0.02 M sodium phosphate, 0.12M NaCl, 0.01% gelatin, 0.05%
Tween 20, pH 7.2, and incubated with anti-bovine
a-lactalbumin (1/2000 dilution) for 2 hours at room
temperature. The gel was washed twice (2 min.) with an
ELISA wash buffer and incubated with goat anti-rabbit
IgG-HRP for 2 hours at room temperature, followed by
washing 3 times with a wash buffer and washing once with
double-distilled water. The gel was placed in a substrate
solution (25 mg 3,3'-diaminobenzidine, 1 ml 1% CoCl2 in
H2O, 49 ml PBS pH 7.4 and 0.05 ml 30% H2O2) and monitored
for color development. The membrane was air dried.

It is understood that the invention is not
confined to the particular constructions and arrangements
herein illustrated and described, but embraces such
modified forms thereof as come within the scope of the
claims following the bibliography.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: BLECK, GREGORY T.
               BREMEL, ROBERT D.

(ii) TITLE OF INVENTION: DNA SEQUENCE ENCODING BOVINE
     ALPHA-LACTALBUMIN AND METHODS OF USE

(iii) NUMBER OF SEQUENCES: 20

(iv) CORRESPONDENCE ADDRESS:
     (A) ADDRESSEE: ANDRUS, SCEALES, STARKE & SAWALL
     (B) STREET: 100 E. WISCONSIN AVE., SUITE 1100
     (C) CITY: MILWAUKEE
     (D) STATE: WI
     (E) COUNTRY: USA
     (F) ZIP: 53202-4178

(v) COMPUTER READABLE FORM:
     (A) MEDIUM TYPE: Floppy disk
     (B) COMPUTER: IBM PC compatible
     (C) OPERATING SYSTEM: PC-DOS / MS-DOS
     (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
     (A) APPLICATION NUMBER:
     (B) FILING DATE:
     (C) CLASSIFICATION:

(viii) ATTORNEY / AGENT INFORMATION:
     (A) NAME: Sara, Charles S
     (B) REGISTRATION NUMBER: 30,492
     (C) REFERENCE / DOCKET NUMBER: F-3262-1

(ix) TELECOMMUNICATION INFORMATION:
     (A) TELEPHONE: (608) 255-2022
     (B) TELEFAX: (608) 255-2182
     (C) TELEX: 26832 AND STARK

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 26 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ATGACCATGA TTACGAATTC ATCGTA 26

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 88 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
GAACAGTTAT CTAGATCTCG AGCTCGCGAA AGCTTGCATG CCTGCAGGTC GACTCTAGAG 60
GATCCCCGGG TACCGAGCTC GAATTCAC 88

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 2044 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
     (A) NAME/KEY: signal peptide coding region
     (B) LOCATION: 1943..2043

(ix) FEATURE:
     (A) NAME/KEY: inherited control region for a-lactalbumin
     (B) LOCATION: 1966

(ix) FEATURE:
     (A) NAME/KEY: putative steroid response element
     (B) LOCATION: 1433..1446

(ix) FEATURE:
     (A) NAME/KEY: RNA polymerase binding region
     (B) LOCATION: 1961..1978

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GATCAGTCCT GGGTGGTCAT TGAAAGGACT GATGCTGAAG TTGAAGCTCC AATACTTTGG 60
CCACCTGATG CGAAGAACTG ACTCATGTGA TAAGACCCTG ATACTGGGAA AGATTGAAGG 120
CAGGAGGAGA AGGGATGACA GAGGATGGAA GAGTTGGATG GAATCACCAA CTCGATGGAC 180
ATGAGTTTGA GCAAGCTTCC AGGAGTTGGT AATGGGCAGG GAAGCCTGGC GTGCTGCAGT 240
CCATGGGGTT GCAAAGAGTT GGACACTACT GAGTGACTGA ACTGAACTGA TAGTGTAATC 300
CATGGTACAG AATATAGGAT AAAAAAGAGG AAGAGTTTGC CCTGATTCTG AAGAGTTGTA 360
GGATATAAAA GTTTAGAATA CCTTTAGTTT GGAAGTCTTA AATTATTTAC TTAGGATGGG 420
TACCCACTGC AATATAAGAA ATCAGGCTTT AGAGACTGAT GTAGAGAGAA TGAGCCCTGG 480
CATACCAGAA GCTAACAGCT ATTGGTTATA GCTGTTATAA CCAATATATA ACCAATATAT 540
TGGTTATATA GCATGAAGCT TGATGCCAGC AATTTGAAGG AACCATTTAG AACTAGTATC 600
CTAAACTCTA CATGTTCCAG GACACTGATC TTAAAGCTCA GGTTCAGAAT CTTGTTTTAT 660
AGGCTCTAGG TGTATATTGT GGGGCTTCCC TGGTGGCTCA GATGGTAAAG TGTCTGCCTG 720
CAATGTGGGT GATCTGGGTT CGATCCCTGG CTTGGGAAGA TCCCCTGGAG AAGGAAATGG 780
CAACCCACTC TAGTACTCTT ACCTGGAAAA TTCCATGGAC AGAGGAGCCT TGTAAGCTAC 840
AGTCCATGGG ATTGCAAAGA GTTGAACACA ACTGAGCAAC [illegible] [illegible] 900
ATACACCTGT GAGGTGAAGT GAAGTGAAGG TTCAATGGAG [illegible] [illegible] 960
GATTCTTTAC CATCTGAGCC ACCAGGGAAG CCCAAGAATA CTGGAG[illegible] [illegible] 1020
CTTCTCCAGG GGATCTTCCC ATCCCAGGAA TTGAACTGGA GTCTCCTGCA TTTCAGGTGG 1080
ATTCTTCACC AGCTGAACTA CCAGGTGGAT ACTACTCGAA TATTAAAGTG CTTAAAGTCC 1140
AGTTTTCCCA CCTTTCCCAA AAAGGTTGGG TCACTCTTTT TTAACCTTCT GTGGCCTACT 1200
CTGAGGCTGT CTACAAGCTT ATATATTTAT GAACACATTT ATTGCAAGTT GTTAGTTTTA 1260
GATTTACAAT GTGGTATCTG GCTATTTAGT GGTATTGGTG GTTGGGGATG GGGAGGCTGA 1320
TAGCATCTCA GAGGGCAGCT AGATACTGTC ATACACACTT TTCAAGTTCT CCATTTTTGT 1380
GAAATAGAAA GTCTCTGGAT CTAAGTTATA TGTGATTCTC AGTCTCTGTG GTGATATT[illegible] 1440
ATTCTACTCC TGACCACTCA ACAAGGAACC AAGATATCAA GGGACACTTG TTTTGTTTCA 1500
TGCCTGGGTT GAGTGGGCCA TGACATATGA TGATGTACAG TCCTTTTCCA TATTCTGTAT 1560
GTCTCTAAGA GGAAGGAGGA GTTGGCCGTG GACCCTTTGT GCATTTTCTG ATTGCTTCAC 1620
TTGTATTACC CCTGAGGCCC CCTTTGTTCC TGAAATAGGT TGGGCACATC TTGCTTCCTA 1680
GAACCAAGAC TACCAGAAAC AACATAAATA AAGCCAAATG GGAAACAGGA TCATGTTTGT 1740
AACACTCTTT GGGCAGGTAA CAATACCTAG TATGGACXAG AGATTCTGGG GAGGAAAGGA 1800
AAAGTGGGGT GAAATTACTG AAGGAAGCTC AATGTTTCTT TGTTGGTTTT ACTGGCCTCT 1860
CTTGTCATCC TCTTCCTGGA TGTAAGGCTT GATGCCAGGG CCCCTAAGGC TTTTTCCACA 1920
AATAAAAGGA GGTGAGCAGT GTGGTGACCC CATTTCAGAA TCTTGAGGGG TAACCAAAAT 1980
GATGTCCTTT GTCTCTCTGC TCCTGGTAGG CATCCTATTC CATGCCACCC AGGCTGAACA 2040
GTTA 2044

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 14 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CATATTCTAT TCTA

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 15 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CATATTCTAT TCCTA 15

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 15 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CATATTCTAT TTCTA 15

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 18 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TCTTGAGGGG TAACCAAA 18

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 17 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TCTTGGGGGT AGCCAAA 17

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 18 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TCTTGGGGGG TCACCAAA

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 43 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
ACGCTTGTAA AACGACGGCC AGTTGATTCT CAGTCTCTGT GGT

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 43 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AGCATCAGGA AACAGCTATG ACCTGGGTGG CATGGAATAG GAT

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 21 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CTCTTCCTGG ATGTAAGGCT T

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 24 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
TCCTGGGTGG TCATTGAAAG GACT 24
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 25 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CAATGTGGTA TCTGGCTATT TAGTG

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 20 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
AGCCTGGGTG GCATGGAATA 20

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 20 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GAAACGCGGT ACAGACCCCT 20

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 46 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
AGGAAGCTCA ATGTTTCTTT GTTGGTTTTA CTGGCCTCTC TTGTCA

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 45 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
AGGAGGCTAT TCTTTCCTTT TAGTCTATAC TGTCTTCGCT CTTCA 45

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 69 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TATAAGAAAT CAGGCTTTAG AGACTGATGT AGAGAGAATG AGCCCTGGCA TACCAGAAGC 60
TAACAGCTA 69

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 55 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TCTCAGAAAT CACACTTTTT TGCCTGTGGC CTTGGCAACC AAAAGCTAAC ACATA 55

CLAIMS

What is claimed is:

1. A mammary specific DNA sequence encoding
bovine a-lactalbumin and promoting quantitative
differences in gene expression among mammals, wherein the
DNA sequence is characterized by variations in the gene
structure in the control region of bovine a-lactalbumin.

2. The DNA sequence of claim 1 wherein one of
the variations is in the -13 position of the DNA
sequence.

3. The DNA sequence of claim 2 wherein the -13
position is occupied by adenine.

4. The DNA sequence of claim 1 comprising the
following DNA sequence (SEQ ID NO: 19) in the control
region of bovine a-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

5. The DNA sequence of claim 1 comprising the
following DNA sequence (SEQ ID NO: 17) in the control
region of bovine a-lactalbumin:

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGG -

6. The DNA sequence of claim 1 comprising the
DNA sequence listed in Fig. 5 (SEQ ID NO: 3) in the
control region of bovine a-lactalbumin.

7. An expression vector comprising the DNA
sequence of claim 1.

8. An expression system comprising a mammary
specific a-lactalbumin control region construct which
when genetically incorporated into a mammal permits the
female species of that mammal to produce the desired
recombinant protein in its milk.

9. The expression system of claim 8 which
comprises at least one a-lactalbumin control region
construct operatively linked to a DNA sequence coding for
a signal peptide and a-lactalbumin.

10. The expression system of claim 8 which
comprises a 3' flanking region downstream of the DNA
sequence coding for a-lactalbumin.

11. The expression system of claim 10 wherein the
construct includes a 5' flanking region upstream of the
DNA sequence coding for the signal peptide.

12. The expression system of claim 8 wherein the
construct comprises a 5' a-lactalbumin flanking region
attached to a bovine B-casein gene.

13. The expression system of claim 12 wherein the
construct contains the polyadenylation site of B-casein
and approximately 100 base pairs of 5' a-lactalbumin
flanking region.

14. The expression system of claim 12 wherein the
construct includes the proximal promoter from a first
milk protein and the distal control region of a second
milk protein.

15. The expression system of claim 14 wherein the
first and second milk proteins are selected from the
group consisting of a-lactalbumin, B-casein, as1-casein,
as2-casein and k-casein.

16. The expression system of claim 12 wherein the 
construct includes the proximal promoter of B-casein and 
the distal control region of a-lactalbumin. 

17. A genetically engineered mammal characterized
by an expression system comprising an a-lactalbumin
control region operatively linked to an exogenous DNA
sequence coding for a desired protein to be expressed in
milk through a DNA sequence coding for a signal peptide
effective in secreting and maturing the protein in
mammary tissue.

18. The genetically engineered mammal of claim 17
wherein the a-lactalbumin control region includes a
mammary specific DNA sequence encoding bovine
a-lactalbumin and having the following nucleotide sequence
(SEQ ID NO: 19) in the control region of bovine
a-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

19. The genetically engineered mammal of claim 17
wherein the a-lactalbumin control region includes a
mammary specific DNA sequence encoding bovine
a-lactalbumin and having the following nucleotide sequence
(SEQ ID NO: 20) in the control region of bovine
a-lactalbumin:

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACT •

20. Products produced by the genetically 
engineered mammal of claim 17. 

21. Semen produced by the genetically engineered 
mammal of claim 17 . 

22. Milk produced by the genetically engineered 
mammal of claim 17 . 

23. A transgenic mammal of claim 17.

24. A DNA sequence coding for a-lactalbumin which 
is operatively linked in an expression system of a 
mammary specific a-lactalbumin protein control region, or 
any control region which specifically activates a- 
lactalbumin in milk or in mammary tissue, through a 
signal peptide that permits secretion and maturation of 
the a-lactalbumin in the mammary tissue. 

25. The DNA sequence of claim 24 comprising the 
following DNA sequence (SEQ ID NO: 19) in the control 
region of bovine a-lactalbumin: 

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA 
AGCTAACAGCTA. 

26. The DNA sequence of claim 24 comprising the 
following DNA sequence (SEQ ID NO: 20) in the control 
region of bovine a-lactalbumin: 

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTGA.

27. The DNA sequence of claim 24 comprising the 
DNA sequence (SEQ ID NO: 3) listed in Fig. 5 in the 
control region of bovine a-lactalbumin. 

28. A process for genetically engineering the
incorporation of one or more copies of a construct
comprising an a-lactalbumin control region or any control
region sequence specifically activated in mammary tissue,
operatively linked to a DNA sequence coding for a desired
recombinant protein through a DNA sequence coding for a
signal peptide that permits the secretion and maturation
of a-lactalbumin in the mammary tissue.

29. The process of claim 28 wherein the construct
is genetically incorporated into a mammal and the
recombinant protein product is subsequently expressed and
secreted into or along with the milk of the lactating
genetically engineered mammal.

30. The process of claim 29 wherein the construct
is genetically incorporated in mammalian embryos, mammalian
mammary glands, or mammalian stem cells.

31. The process of claim 28 wherein the mammals
are cows, sheep, goats, mice, oxen, camels, water
buffaloes, llamas and pigs.

32. A process for the production and secretion
into mammal's milk of an exogenous recombinant protein
comprising the steps of:

a. producing milk in a genetically
engineered mammal characterized by an
expression system comprising a-lactalbumin
control region operatively linked to an
exogenous DNA sequence coding for the
recombinant protein through a DNA sequence
coding for a signal peptide effective in
secreting and maturing the recombinant
protein in mammary tissue;

b. collecting the milk; and

c. isolating the exogenous recombinant
protein from the milk.

33. The process according to claim 32, wherein
said expression system also includes a 3' flanking region
coding for a-lactalbumin downstream of the DNA sequence.

34. The process according to claim 32, wherein
said expression system also includes a 5' flanking region
coding for the signal peptide upstream of the DNA
sequence.

35. A selection characteristic for identifying
superior milk producing mammals comprising a DNA sequence
encoding bovine a-lactalbumin and having the DNA sequence
(SEQ ID NO: 3) listed in Fig. 5 in the control region of
bovine a-lactalbumin.

36. A selection characteristic for identifying 
superior milk producing mammals comprising inherited 
genetic material which is a mammary specific DNA sequence 
encoding bovine a-lactalbumin and promoting quantitative 
differences in gene expression among mammals, wherein the 
DNA sequence is characterized by variations in the gene 
structure in the control region of bovine a-lactalbumin. 

37. The selection characteristic of claim 36 
wherein one of the variations is in the -13 position of 
the DNA sequence. 

38. The selection characteristic of claim 37 
wherein the -13 position is occupied by adenosine. 

39. The selection characteristic of claim 36
comprising the following DNA sequence (SEQ ID NO: 19) in
the control region of bovine a-lactalbumin:
TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

40. The selection characteristic of claim 36
comprising the following DNA sequence (SEQ ID NO: 20) in
the control region of bovine a-lactalbumin:
AGGAAGCTCAATGTTTCTTTGTTGGTT

41. The selection characteristic of claim 36
comprising the DNA sequence (SEQ ID NO: 3) listed in Fig.
5 in the control region of bovine a-lactalbumin.

42. A method of predicting superior milk and milk
protein production in mammals comprising comparing
selected positions on the DNA sequence of the inherited
control region for a-lactalbumin in a subject mammal
with analogous positions on the DNA sequence of the
control region for a-lactalbumin from mammals known for
superior milk and milk protein production.

43. The method of claim 42 wherein one of the 
selected positions is the -13 position on the control 
region of the DNA sequence. 

44. The method of claim 43 wherein the -13 
position is occupied by the base adenine. 

45. The method of claim 42 wherein the selected
DNA sequence comprises a steroid response element.

AMENDED CLAIMS

[received by the International Bureau on 14 December 1992 (14.12.92);
original claims 2, 5, 9-11, 19, 24-31, 33, 35-45 cancelled;
original claims 1, 3, 6, 8, 12, 13, 15, 17, 18, 20-23, 32 amended;
other claims unchanged (4 pages)]

1. An isolated DNA sequence which promotes mammary 
specific expression of mRNA in mammary cells of lactating animals 
comprising a variant of bovine a-lactalbumin 5' flanking region 
regulatory sequence wherein the variant is in the -13 position 
from the start of the signal peptide coding region. 

3. The DNA sequence of claim 1 wherein the -13 
position is occupied by adenine. 

4. The DNA sequence of claim 1 comprising DNA sequence
(SEQ ID NO: 19) in the control region of bovine a-lactalbumin:
TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGAAGCTA
ACAGCTA.

6. The DNA sequence of claim 1 comprising the DNA
sequence listed in Fig. 5 (SEQ ID NO: 3) as the 5' flanking region
of bovine a-lactalbumin.

7. An expression vector comprising the DNA sequence
of claim 1.

8. An expression system comprising a mammary specific
a-lactalbumin control region construct operatively linked to a
DNA sequence coding for a signal peptide, which when genetically
incorporated into a non-human mammal permits the female species
of that mammal to produce the desired recombinant protein in its
milk.

12. The expression system of claim 8 wherein the
construct comprises a 5' a-lactalbumin flanking region attached
to a bovine B-casein DNA sequence.

13. The expression system of claim 12 wherein the
construct contains the polyadenylation site of B-casein 5'
flanking region.

14. The expression system of claim 12 wherein the
construct includes the proximal promoter from a first milk
protein and the distal control region of a second milk protein.

15. The expression system of claim 14 wherein the
first and second milk proteins are selected from the group
consisting of a-lactalbumin, B-lactoglobulin, B-casein,
as1-casein, as2-casein and k-casein.

16. The expression system of claim 12 wherein the
construct includes the proximal promoter of B-casein and the
distal control region of a-lactalbumin.

17. A transgenic non-human mammal containing an 
expression system comprising the DNA sequence listed in Fig. 5 
(SEQ ID NO: 3) as the a-lactalbumin control region operatively 
linked to an exogenous DNA sequence coding for a desired protein 
to be expressed in milk through a DNA sequence coding for a 
signal peptide effective in secreting and maturing the protein 
in mammary tissue. 

18. The transgenic non-human mammal of claim 17
wherein the a-lactalbumin control region includes a mammary
specific DNA sequence encoding bovine a-lactalbumin and having
the following nucleotide sequence (SEQ ID NO: 19) in the control
region of bovine a-lactalbumin:
TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGAAGCTA
ACAGCTA.

20. Products produced by the genetically engineered
non-human mammal of claim 17.

21. Semen produced by the transgenic non-human mammal
of claim 17.

22. Milk produced by the transgenic non-human mammal
of claim 17.

23. A transgenic non-human mammal of claim 17.

32. A process for the production and secretion into
non-human mammal's milk of an exogenous recombinant protein
comprising the steps of:

a. producing milk in a genetically engineered mammal
containing an expression system comprising a
specific a-lactalbumin control region
operatively linked to a DNA sequence coding for
a signal peptide, which is effective in secreting
and maturing the recombinant protein in mammary
tissue;

b. collecting the milk; and

c. isolating the exogenous recombinant protein from
the milk.

34. The process according to claim 32, wherein said
expression system also includes a 5' flanking region coding for
the signal peptide upstream of the DNA sequence.